Privacy-Preserving Synthetic Educational Data Generation. (arXiv:2207.03202v1 [cs.CY])
Institutions collect massive learning traces but may not disclose them because
of privacy concerns. Synthetic data generation opens new opportunities for research
in education. In this paper we present a generative model for educational data
that can preserve the privacy of participants, and an evaluation framework for
comparing synthetic data generators. We show how naive pseudonymization can
lead to re-identification threats and suggest techniques to guarantee privacy.
We evaluate our method on existing massive educational open datasets.
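
To illustrate the re-identification concern, here is a minimal sketch of a linkage attack on pseudonymized records. It is not the paper's method or evaluation framework: the records, the attribute names, and the attacker's auxiliary knowledge are all hypothetical toy values chosen for illustration.

```python
from collections import Counter

# Hypothetical toy records: (pseudonym, year_of_birth, postcode_prefix, course).
# The real name has been replaced by a pseudonym, but other attributes remain.
records = [
    ("u01", 1990, "750", "python101"),
    ("u02", 1990, "750", "python101"),
    ("u03", 1985, "690", "stats201"),
    ("u04", 1972, "130", "ml301"),
]

def quasi_identifier(record):
    """Drop the pseudonym and keep the remaining attributes."""
    return record[1:]

def unique_pseudonyms(rows):
    """Return pseudonyms whose attribute combination occurs exactly once."""
    counts = Counter(quasi_identifier(r) for r in rows)
    return [r[0] for r in rows if counts[quasi_identifier(r)] == 1]

# Hypothetical auxiliary knowledge an attacker might hold about a named person.
auxiliary = {"Alice": (1985, "690", "stats201")}

def link(rows, aux):
    """Match named people to pseudonyms when their attributes select one row."""
    exposed = set(unique_pseudonyms(rows))
    matches = {}
    for name, attrs in aux.items():
        for r in rows:
            if quasi_identifier(r) == attrs and r[0] in exposed:
                matches[name] = r[0]
    return matches

at_risk = unique_pseudonyms(records)   # rows that stand alone in the dataset
linked = link(records, auxiliary)      # named people tied to a pseudonym
```

In this toy data, two of the four pseudonyms have a unique attribute combination, so anyone who knows those attributes about a person can recover that person's full trace even though no name is stored.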